Adaptive Searching in Succinctly Encoded Binary Relations and Tree-Structured Documents

Authors

  • Jérémy Barbay
  • Alexander Golynski
  • J. Ian Munro
  • S. Srinivasa Rao
Abstract

The most heavily used methods to answer conjunctive queries on binary relations (such as the one associating keywords with web pages) are based on inverted lists stored in sorted arrays and use variants of binary search. We show that a succinct representation of the binary relation permits much better results, while using space within a lower order term of the optimal. We apply our results not only to conjunctive queries on binary relations, but also to queries on semi-structured documents such as XML documents or file-system indexes, using a variant of an adaptive algorithm used to solve conjunctive queries on binary relations.
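As a point of reference, the baseline the abstract mentions (inverted lists kept as sorted arrays and intersected with variants of binary search) can be sketched as follows. This is only an illustrative sketch of that baseline, not the succinct representation the paper proposes; the names `gallop` and `intersect` are hypothetical, and the lists are assumed to hold strictly increasing integers such as document identifiers.

```python
from bisect import bisect_left


def gallop(arr, target, lo):
    """Return the smallest index i >= lo with arr[i] >= target, or len(arr)."""
    step = 1
    hi = lo
    # Double the step until we reach an element >= target or leave the array.
    while hi < len(arr) and arr[hi] < target:
        lo = hi + 1
        hi += step
        step *= 2
    # Finish with a binary search inside the bracketed range.
    return bisect_left(arr, target, lo, min(hi, len(arr)))


def intersect(lists):
    """Intersect sorted lists of strictly increasing integers (assumed input)."""
    if not lists or any(len(lst) == 0 for lst in lists):
        return []
    pos = [0] * len(lists)          # current search position in each list
    result = []
    candidate = max(lst[0] for lst in lists)
    while True:
        matched = True
        for j, lst in enumerate(lists):
            pos[j] = gallop(lst, candidate, pos[j])
            if pos[j] == len(lst):
                return result       # one list is exhausted: no more matches
            if lst[pos[j]] > candidate:
                # Candidate is missing from this list; jump to the next
                # possible value and restart the round.
                candidate = lst[pos[j]]
                matched = False
                break
        if matched:
            result.append(candidate)
            pos[0] += 1
            if pos[0] == len(lists[0]):
                return result
            candidate = lists[0][pos[0]]


# Hypothetical posting lists for three keywords.
pages = intersect([[1, 3, 5, 7, 9, 11], [3, 4, 5, 6, 11, 20], [0, 3, 11, 12]])
```

The doubling search makes the cost depend on how far apart successive matches are rather than on the full list lengths, which is the sense in which such intersection algorithms are called adaptive.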


Similar resources

Adaptive Search Algorithm for Patterns, in Succinctly Encoded XML

We propose an adaptive algorithm for context queries (queries expressed as preorder and ancestor-descendant relations on labeled nodes), which can be used to find patterns in XML documents. Our algorithm takes advantage of the correlation between terms of the query without any preprocessed information, and it runs in time (kd(lg lg min(n,s)+lg lg(r))) in the RAM model, where k is the number of t...


Adaptive Searching in Succinctly Encoded Binary Relations and Tree-Structured Documents Extended Abstract

This paper deals with succinct representations of data types motivated by applications in posting lists for search engines, in querying XML documents, and in the more general setting (which extends XML) of multi-labeled trees, where several labels can be assigned to each node of a tree. To find the set of references corresponding to a set of keywords, one typically intersects the list of refere...


Ternary Tree and Memory-Efficient Huffman Decoding Algorithm

In this study, the focus was on the use of a ternary tree over a binary tree. Here, a new one-pass algorithm for decoding adaptive Huffman ternary tree codes was implemented. To reduce the memory size and speed up the process of searching for a symbol in a Huffman tree, we exploited the property of the encoded symbols and proposed a memory-efficient data structure to represent the codeword length of ...


Space-efficient Data Structures for Collections of Textual Data

This thesis focuses on the design of succinct and compressed data structures for collections of string-based data, specifically sequences of semi-structured documents in textual format, sets of strings, and sequences of strings. The study of such collections is motivated by a large number of applications both in theory and practice. For textual semi-structured data, we introduce the concept of ...


Searching structured documents

Structured document interchange formats such as XML and SGML are ubiquitous; however, information retrieval systems supporting structured searching are not. Structured searching can result in increased precision. A search for the author “Smith” in an unstructured corpus of documents specializing in iron-working could have a lower precision than a structured search for “Smith as author” in the sa...



Journal:
  • Theor. Comput. Sci.

Volume 387

Publication date: 2006